Cache Efficient Bloom Filters for Shared Memory Machines
Author
Abstract
Bloom filters are a well-known data structure that supports approximate set membership queries with no false negatives. Each element in the universe represented by the Bloom filter is associated with k random bits in the structure. A traditional Bloom filter therefore requires k non-local memory operations to insert an element or perform a lookup. For very large Bloom filters, these k accesses may require k disk seeks. Lookups can be expensive even for moderately sized filters that fit into main memory, since k non-local memory accesses may result in L3, L2, and L1 cache misses. In this paper, we implement a cache-efficient blocked Bloom filter that performs insertions and lookups while accessing only a small block of memory. We improve upon the implementation described by [4] by adapting dynamically to an unbalanced assignment of elements to memory blocks. The end result is a Bloom filter whose superior cache locality allows it to outperform a standard Bloom filter on a shared memory machine even when the standard filter fits into main memory. This paper also surveys the design and analysis of three existing types of Bloom filters: a standard Bloom filter, a blocked Bloom filter, and a scalable Bloom filter. Our cache-efficient Bloom filter builds directly on ideas from these data structures to obtain good memory locality.
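The blocked layout described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation and omits the dynamic adaptation to unbalanced blocks. The class name `BlockedBloomFilter`, the 512-bit block size (chosen to match a typical 64-byte cache line), the BLAKE2b hash, and the double-hashing scheme are all assumptions made for illustration. The point it shows is that one hash selects a block and all k bits for an element then fall inside that block, so an insert or lookup touches a single small region of memory.

```python
import hashlib


class BlockedBloomFilter:
    """Illustrative blocked Bloom filter: all k bits of an item share one block."""

    def __init__(self, num_blocks, block_bits=512, k=4):
        # block_bits is assumed to be a power of two and a multiple of 8.
        self.num_blocks = num_blocks
        self.block_bits = block_bits
        self.k = k
        self.bits = bytearray(num_blocks * block_bits // 8)

    def _positions(self, item):
        # Derive two 64-bit values from one digest (hypothetical hashing choice).
        digest = hashlib.blake2b(item.encode("utf-8"), digest_size=16).digest()
        h1 = int.from_bytes(digest[:8], "little")
        h2 = int.from_bytes(digest[8:], "little") | 1  # odd step, so offsets differ
        base = (h1 % self.num_blocks) * self.block_bits  # first bit of chosen block
        start = h1 // self.num_blocks
        for i in range(self.k):
            # Double hashing, reduced modulo the block size to stay in the block.
            yield base + (start + i * h2) % self.block_bits

    def add(self, item):
        for p in self._positions(item):
            self.bits[p >> 3] |= 1 << (p & 7)

    def __contains__(self, item):
        return all(self.bits[p >> 3] & (1 << (p & 7)) for p in self._positions(item))


bf = BlockedBloomFilter(num_blocks=1024)
for word in ("alpha", "beta", "gamma"):
    bf.add(word)
print("alpha" in bf)  # inserted items are always reported as present
```

A standard Bloom filter would instead reduce each of the k positions modulo the full bit array, which is what scatters the accesses across memory. Python's `bytearray` gives no cache-line alignment guarantee, so the sketch demonstrates only the indexing scheme, not the performance effect.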
Similar resources
Data Caching in Ad Hoc Networks using Bloom Filters
Data caching provides efficient data access by maintaining replicas of data in strategic parts of the network. However, current research in this area does not manage memory space of each node efficiently. We propose an improvement by considering Bloom filters, a fast, space-efficient probabilistic method for looking up data. We compare the system performance with and without Bloom fil...
Two-tier Bloom filter to achieve faster membership testing
Introduction: Bloom filters [1] are a space-efficient, probabilistic data structure for representing a list of elements (for example, a list of strings). A Bloom filter is an array of m bits. A string is mapped into a Bloom filter by inputting it to a group of k hash functions resulting in k array positions. Each indexed array position is set to 1. A string is tested for membership by inputting...
The Power of 1 + α for Memory-Efficient Bloom Filters
This paper presents a cache-aware Bloom Filter algorithm with improved cache behavior and lower false-positive rates compared to prior work. The algorithm relies on the power-of-two choice principle to provide a better distribution of set elements in a Blocked Bloom Filter. Instead of choosing a single block, we insert new elements into the least-loaded of two blocks to achieve a low false-po...
A Cache Architecture for Counting Bloom Filters: Theory and Application
Within packet processing systems, lengthy memory accesses greatly reduce performance. To overcome this limitation, network processors utilize many different techniques, for example, utilizing multilevel memory hierarchies, special hardware architectures, and hardware threading. In this paper, we introduce a multilevel memory architecture for counting Bloom filters. Based on the probabilities of...
Buffered Bloom Filters on Solid State Storage
Bloom Filters are widely used in many applications including database management systems. With a certain allowable error rate, this data structure provides an efficient solution for membership queries. The error rate is inversely proportional to the size of the Bloom filter. Currently, Bloom filters are stored in main memory because the low locality of operations makes them impractical on secon...